Whole-Genome Sequencing of Porcine Epidemic Diarrhea
Virus by Illumina MiSeq Platform

Leyi Wang, Tod Stuber, Patrick Camp, Suelee Robbe-Austerman, 
and Yan Zhang 

Abstract 

Porcine epidemic diarrhea virus (PEDV) belongs to the genus Alphacoronavirus of the family Coronaviridae.
PEDV was identified as an emerging pathogen in US pig populations in 2013. Since then, this virus has
been detected in at least 31 states in the USA and has caused significant economic loss to the swine
industry. Active surveillance and characterization of PEDV are essential for monitoring the virus. Obtaining
comprehensive information about the PEDV genome can improve our understanding of the evolution of
PEDV and the emergence of new strains, and can enhance vaccine design. In this chapter, both a
targeted amplification method and a random-priming method are described to amplify the complete
genome of PEDV for sequencing using the MiSeq platform. Overall, this protocol provides a useful
two-pronged approach to completing whole-genome sequences of PEDV, depending on the amount of
virus in the clinical samples.
1 Introduction


Complete genome sequencing and genetic analysis have significantly
improved our understanding of the evolution and relationship of
porcine epidemic diarrhea virus (PEDV) strains worldwide. The
first PEDV whole-genome sequence was completed for the prototype
strain CV777 in 2001 [1]. Since then, many more PEDV strains
have been sequenced, and over 170 whole-genome sequences
have now been deposited in GenBank. Based on phylogenetic analysis
of the whole-genome sequences, PEDV has been classified into
two genogroups, 1 and 2, to which the variant and classical strains
of US PEDV belong, respectively [2].

Since the first 454 FLX pyrosequencing platform was introduced
to the market in 2005, next-generation sequencing (NGS)
has significantly advanced research in diverse fields. NGS has the
advantages of high throughput and cost-effectiveness. Currently,
there are several platforms available, including the Genome
Analyzer developed by Illumina/Solexa and the Personal Genome
Machine (PGM) by Ion Torrent. One of the Illumina NGS platforms,
MiSeq, is commonly used in diagnostic laboratories.

In general, sequencing PEDV directly from fecal samples is
technically challenging without prior amplification with
specific primers. In this chapter, we describe a useful two-pronged
approach in which the random-priming method is used to
sequence the complete PEDV genome from samples with Ct values
of less than 15, whereas the targeted amplification method is
recommended for clinical fecal samples with
higher Ct values (low viral loads).
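
The choice between the two methods can be written down as a simple rule. The sketch below is only an illustration of the Ct cutoff stated above; the function name is hypothetical, and assigning a Ct of exactly 15 to the targeted method is an assumption, since the text specifies only "less than 15" for random priming.

```python
# Ct cutoff taken from the text: random priming (SISPA) for Ct values below 15.
SISPA_CT_CUTOFF = 15.0


def choose_amplification_method(ct_value: float) -> str:
    """Return the amplification method suggested by a real-time RT-PCR Ct value.

    Hypothetical helper: a Ct of exactly 15 is assigned to targeted
    amplification, which is an assumption not stated in the chapter.
    """
    if ct_value < 0:
        raise ValueError("Ct values cannot be negative")
    if ct_value < SISPA_CT_CUTOFF:
        return "random-priming (SISPA)"
    return "targeted amplification"


if __name__ == "__main__":
    for ct in (12.3, 15.0, 28.7):
        print(ct, "->", choose_amplification_method(ct))
```

A low Ct value reflects a high viral load, which is why only those samples are routed to the sequence-independent method.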


2 Materials 

2.1 RNA Extraction from Feces or Intestinal Contents

1. MagMAX Pathogen RNA/DNA Kit (Life Technologies) for viral
RNA extraction from fecal or intestinal contents (see Note 1).

2. HBSS (GIBCO).

2.2 Real-Time Reverse Transcriptase Polymerase Chain Reaction (RT-PCR)

1. One-Step RT-PCR kit (Qiagen).

2. Smart Cycler II (Cepheid, Sunnyvale, CA).

3. 10 pmol primers and probe [3] (see Note 2).

Forward primer: 5'-CATGGGCTAGCTTTCAGGTC-3'.
Reverse primer: 5'-CGGCCCATCACAGAAGTAGT-3'.
Probe: 5'/56-FAM/CATTCTTGGTGGTCTTTCAATCCTGA/ZEN/3IABkFQ/3'.

2.3 Targeted Amplification One-Step RT-PCR

1. One-Step RT-PCR kit (Qiagen) (see Note 3).

2. Oligonucleotide primers dissolved in nuclease-free water to a
stock concentration of 100 pmol/μl and a working concentration
of 20 pmol/μl. The sequences for 19 pairs of primers are
listed in Table 1.

3. Qiagen gel purification kit.

4. Qubit 2.0 Fluorometer (Life Technologies).

2.4 SISPA Method (Sequence-Independent, Single-Primer Amplification)

1. SuperScript III Reverse Transcriptase kit (Invitrogen).

2. RNase H (NEB).

3. Klenow fragment (NEB).

4. Advantage 2 PCR kit (Clontech).

5. 10 mM dNTP mix (NEB).

6. RNase Inhibitor (Promega).
7. Oligonucleotide primers [4] dissolved in nuclease-free water
to a concentration of 50 μM, 1 μM, and 10 μM for P1, P2, and
P3, respectively.

P1: GAC CAT CTA GCG ACC TCC ACN NNN NNN N.
P2: GAC CAT CTA GCG ACC TCC ACT TTT TTT TTT.
P3: GAC CAT CTA GCG ACC TCC AC.

8. QIAquick PCR Purification Kit (Qiagen).

Table 1
Nineteen pairs of primers for whole-genome amplification of PEDV

Fragment no.  Sequence                  Sense    Size (bp)
F1            ACTTAAAAAGATTTTCTATCTAC   Forward  1622
              CGTTAACGATACTAAGAGTGGC    Reverse
F2            TGGTGACCTTGCAAGTGCAGC     Forward  1603
              ATTACCAACAGCCTTATTAAGC    Reverse
F3            ACCATTGACCCAGTTTATAAGG    Forward  1587
              ACAAAAGCACTTACAGTGGC      Reverse
F4            TACACCTTTGATTAGTGTTGG     Forward  1614
              TTTGTAGCGTCTAACTCTAC      Reverse
F5            GTACCAGGTGATCTCAATGTG     Forward  1615
              ACGTGGCAATGTCATGGACG      Reverse
F6            ATGCTGCTGTTGCTGAGGCTC     Forward  1600
              TCAGTTGAGATAGAGTTGGC      Reverse
F7            GTGACAAGTTCGTAGGCTC       Forward  1597
              TAAGTGACAGAACTCACAGG      Reverse
F8            TGCACAAGGTCTTGTTAACATC    Forward  1601
              TCTGTGCACCATTAGGAGAATC    Reverse
F9            ACCTGCGTGTAGTCAAGTGG      Forward  1599
              GTTACCAGTGGAACACCATC      Reverse
F10           ACTGTGCCAACTTCAATACG      Forward  1611
              TCATCAACAAACACACCTGC      Reverse
F11           TGCTCGCAGCATACTATGCAG     Forward  1588
              GTGGTGCAGGCAGCTGTTGAG     Reverse
F12           TCTATGTGCACTAATTATGAC     Forward  1599
              TGATTGCACAATTCGGCCGC      Reverse
F13           CCATACATGATTGCTTTGTC      Forward  1595
              ATCGTCAAGCAGGAGATCC       Reverse
F14           TGTCTAGTAATGATAGCACG      Forward  1647
              TTATCCCATGTTATGCCGAC      Reverse
F15           TAATGATGTTACAACAGGTCG     Forward  1554
              AAGCCATAGATAGTATACTTG     Reverse
F16           TGAGTTGATTACTGGCACGCC     Forward  1598
              GTACTGTATGTAAAAACAGCAG    Reverse
F17           ATCGCAATCTCAGCGTTATG      Forward  1596
              GTGTAAACTGCGCTATTACAC     Reverse
F18           CTGCTTATTATAAGCATTAC      Forward  1603
              GCTTCTGCTGTTGCTTAAGC      Reverse
F19           AGTCTCGTAACCAGTCCAAG      Forward  1065
              TTTTTTTTTTTTGTGTATCCAT    Reverse
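
As a quick sanity check on Table 1, the fragment sizes can be compared against the product range for which the one-step RT-PCR kit has been used in the authors' laboratory (100 bp to 1.8 kb, Note 3). The sketch below is illustrative only; the function and variable names are hypothetical, and the sizes are copied from Table 1.

```python
# Amplicon sizes (bp) copied from Table 1.
FRAGMENT_SIZES_BP = {
    "F1": 1622, "F2": 1603, "F3": 1587, "F4": 1614, "F5": 1615,
    "F6": 1600, "F7": 1597, "F8": 1601, "F9": 1599, "F10": 1611,
    "F11": 1588, "F12": 1599, "F13": 1595, "F14": 1647, "F15": 1554,
    "F16": 1598, "F17": 1596, "F18": 1603, "F19": 1065,
}

# Product range reported for the kit in Note 3 (100 bp to 1.8 kb).
KIT_RANGE_BP = (100, 1800)


def summarize_amplicons(sizes, size_range=KIT_RANGE_BP):
    """Summarize fragment sizes and list any that fall outside size_range."""
    low, high = size_range
    out_of_range = [name for name, bp in sizes.items() if not low <= bp <= high]
    return {
        "n_fragments": len(sizes),
        "total_bp": sum(sizes.values()),
        "longest_bp": max(sizes.values()),
        "out_of_range": out_of_range,
    }


if __name__ == "__main__":
    print(summarize_amplicons(FRAGMENT_SIZES_BP))
```

The summed amplicon length (29,894 bp) exceeds the roughly 28 kb PEDV genome, which is consistent with neighboring fragments overlapping.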


2.5 Detection of PCR Products

1. Agarose.

2. Distilled water.

3. Ethidium bromide.

4. 1x TAE buffer: 40 mM Tris, 20 mM acetic acid, 1 mM EDTA.
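
The 1x TAE composition above can be converted into weigh-out amounts. The following is a minimal sketch, assuming Tris base, glacial acetic acid, and EDTA disodium salt dihydrate as the reagent forms (the chapter does not specify them); the function names are hypothetical. The agarose helper applies the usual weight/volume definition to the 1 % gel used in the Methods.

```python
# Molar masses (g/mol) for the assumed reagent forms.
TRIS_BASE_G_PER_MOL = 121.14
ACETIC_ACID_G_PER_MOL = 60.05
EDTA_NA2_DIHYDRATE_G_PER_MOL = 372.24
GLACIAL_ACETIC_ACID_G_PER_ML = 1.049  # density of glacial acetic acid


def tae_1x_amounts(volume_l: float) -> dict:
    """Amounts for 1x TAE (40 mM Tris, 20 mM acetic acid, 1 mM EDTA)."""
    tris_g = 0.040 * TRIS_BASE_G_PER_MOL * volume_l
    acetic_acid_ml = (
        0.020 * ACETIC_ACID_G_PER_MOL * volume_l / GLACIAL_ACETIC_ACID_G_PER_ML
    )
    edta_g = 0.001 * EDTA_NA2_DIHYDRATE_G_PER_MOL * volume_l
    return {
        "tris_base_g": round(tris_g, 3),
        "glacial_acetic_acid_ml": round(acetic_acid_ml, 3),
        "edta_na2_dihydrate_g": round(edta_g, 3),
    }


def agarose_grams(percent_w_v: float, gel_volume_ml: float) -> float:
    """Grams of agarose for a gel of the given percent (w/v) and volume."""
    return percent_w_v / 100.0 * gel_volume_ml


if __name__ == "__main__":
    print(tae_1x_amounts(1.0))         # amounts per liter of 1x TAE
    print(agarose_grams(1.0, 100.0))   # 1 % gel cast in 100 ml buffer
```

In practice many laboratories dilute a 50x stock instead; the arithmetic is the same with every concentration multiplied by 50.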


2.6 Illumina Nextera DNA Library Preparation

1. Nextera XT Library Prep Kit 96 samples (Box 1 of 2) (Illumina).

2. Nextera XT Library Prep Kit 96 samples (Box 2 of 2) (Illumina).

3. Nextera XT Index Kit 96 indexes-192 samples.

4. Agencourt AMPure XP beads (Beckman Coulter).

5. 96-well PCR plate (Scientific Inc.).

6. 96 Deep Well Block (Invitrogen).

7. Microseal “B” adhesive seals (BioRad).

8. Magnetic plate stand-96 (Life Technologies).

9. Ethanol, 200 proof (Sigma-Aldrich).

2.7 Next-Generation Sequencing

1. MiSeq v2 Reagent Kit 500 cycles PE-Box 1 of 2 (Illumina).

2. MiSeq v2 Reagent Kit Box 2 of 2 (Illumina).

3. MiSeq (Illumina).
2.8 Sequence Assembly and Analysis

1. Kraken.

2. Krona.

3. BWA (Burrows-Wheeler Alignment Tool).

4. SAMtools.

5. Picard.

6. BLAST.

7. BioPython.

8. GATK (Genome Analysis Toolkit).

9. R.

10. IGV.


3 Methods

3.1 Viral RNA Extraction

1. Fecal or intestinal contents were diluted in HBSS to a final
concentration of 20 % and homogenized with five stainless
steel balls, followed by centrifugation at 2000 RCF at 4 °C
for 5 min.

2. The supernatant was used for RNA extraction with the
MagMAX Pathogen RNA/DNA Kit (Life Technologies)
(see Note 1).

3.2 Real-Time RT-PCR Reaction

1. Real-time RT-PCR with a 25 μl reaction volume was performed
using the QIAGEN one-step RT-PCR kit: 5 μl 5x RT-PCR
buffer, 0.5 μl forward primer (10 pmol), 0.5 μl reverse primer
(10 pmol), 0.5 μl probe (10 pmol), 1 μl dNTP, 1 μl enzyme
mix, 0.2 μl RNasin inhibitor (40 units/μl, Promega), and
2.5 μl RNA template.

2. The amplification conditions were 50 °C for 30 min; 95 °C for
15 min; and 45 cycles of 94 °C for 15 s and 60 °C for 45 s.

3.3 One-Step RT-PCR Reaction

1. RT-PCR with a 25 μl reaction volume was performed using the
QIAGEN one-step RT-PCR kit: 5 μl 5x RT-PCR buffer, 0.8 μl
forward primer (20 pmol), 0.8 μl reverse primer (20 pmol)
(Table 1), 1 μl dNTP, 1 μl enzyme mix, 0.2 μl RNasin inhibitor
(40 units/μl, Promega), and 2.5 μl RNA template.

2. The amplification conditions were 50 °C for 30 min; 95 °C for
15 min; and 45 cycles of 94 °C for 30 s, 54 °C for 30 s, and
72 °C for 1 min 30 s.

3. Analyze the PCR products on a 1 % agarose gel run for
1 h at 90 V.

4. Excise the bands of the correct size and perform gel purification
with a Qiagen gel purification kit (see Note 4).

5. Quantify the purified DNA with a fluorescence-based method
(Qubit 2.0 Fluorometer); the final amount of DNA input is
1 μg (see Note 5).

3.4 SISPA Method (see Note 6)

3.4.1 First-Strand Synthesis

1. Combine 1 μl of 50 μM random primer P1, 1 μl of 1 μM oligo dT
primer P2, 1 μl of 10 mM dNTP mix, and 10 pg to 5 μg of RNA
template. Add water to a total volume of 13 μl.

2. Incubate the reaction at 65 °C for 5 min and then on ice
for at least 1 min.

3. Add 4 μl 5x first-strand buffer, 1 μl 0.1 M DTT, 1 μl RNase
inhibitor, and 1 μl of SuperScript III Reverse Transcriptase.

4. Incubate the reaction at 25 °C for 5 min, 50 °C for 30-60 min,
and 70 °C for 15 min.

5. Add 1 μl RNase H (NEB) to the reaction.

6. Incubate at 37 °C for 20 min.

3.4.2 Klenow Amplification

1. Add 3 μl 10x Klenow reaction buffer, 1 μl of 25 pmol dNTP,
and 1 μl of 1 μM random primer P1 to the reaction from
Sect. 3.4.1.

2. Incubate at 95 °C for 2 min and cool to 4 °C.

3. Add 1 μl Klenow fragment (NEB).

4. Incubate at 37 °C for 60 min and 75 °C for 20 min.

3.4.3 PCR Amplification

1. Combine 5 μl 10x Advantage 2 PCR buffer, 1 μl 50x dNTP mix,
2 μl of 10 μM barcode primer P3, 1 μl 50x Advantage 2 Polymerase
Mix, and the DNA template from the Klenow amplification. Add
water to a total volume of 50 μl.

2. Incubate the reaction using the following PCR program:
1 cycle of 95 °C for 5 min; 5 cycles of 95 °C for 1 min, 59 °C
for 1 min, and 68 °C for 1 min 10 s; 33 cycles of 95 °C for 20 s,
59 °C for 20 s, and 68 °C for 1 min 30 s; and 1 cycle of 68 °C
for 10 min.

3. Use 5 μl to analyze the PCR products on a 1 % agarose gel run
for 1 h at 90 V (see Note 7).

4. Use the QIAquick PCR Purification Kit to purify the remaining
45 μl.

5. Quantify the purified DNA with a fluorescence-based method
(Qubit 2.0 Fluorometer); the final amount of DNA input is
1 μg.

3.5 Library Preparation Using Nextera XT Kit

1. Perform library preparation according to the Illumina
manual, which includes tagmentation of genomic DNA,
PCR amplification, PCR cleanup, library normalization, and
final library pooling for MiSeq sequencing.
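
The 25 μl reaction recipes in Sects. 3.2 and 3.3 list the reagents but not the water volume. A minimal sketch of the arithmetic follows, assuming nuclease-free water makes up the remainder of the reaction; the function names and the 10 % pipetting overage are hypothetical and not part of the protocol.

```python
REACTION_VOLUME_UL = 25.0

# Per-reaction volumes (μl) copied from Sect. 3.2 (real-time RT-PCR).
REALTIME_RT_PCR_UL = {
    "5x RT-PCR buffer": 5.0, "forward primer": 0.5, "reverse primer": 0.5,
    "probe": 0.5, "dNTP": 1.0, "enzyme mix": 1.0,
    "RNasin inhibitor": 0.2, "RNA template": 2.5,
}

# Per-reaction volumes (μl) copied from Sect. 3.3 (targeted one-step RT-PCR).
TARGETED_RT_PCR_UL = {
    "5x RT-PCR buffer": 5.0, "forward primer": 0.8, "reverse primer": 0.8,
    "dNTP": 1.0, "enzyme mix": 1.0,
    "RNasin inhibitor": 0.2, "RNA template": 2.5,
}


def water_to_add(components_ul, total_ul=REACTION_VOLUME_UL):
    """Water volume per reaction, assuming water fills the remaining volume."""
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return round(total_ul - used, 2)


def master_mix(components_ul, n_reactions, overage=0.10):
    """Scale shared reagents for n reactions; the RNA template is added per tube.

    The overage fraction is a hypothetical allowance for pipetting loss.
    """
    scale = n_reactions * (1.0 + overage)
    mix = {
        name: round(volume * scale, 2)
        for name, volume in components_ul.items()
        if name != "RNA template"
    }
    mix["nuclease-free water"] = round(water_to_add(components_ul) * scale, 2)
    return mix


if __name__ == "__main__":
    print(water_to_add(REALTIME_RT_PCR_UL))   # water per real-time reaction
    print(water_to_add(TARGETED_RT_PCR_UL))   # water per targeted reaction
    print(master_mix(TARGETED_RT_PCR_UL, n_reactions=19))
```

For the targeted method each of the 19 primer pairs needs its own tube, so in practice the primers would be left out of a shared mix and added per reaction.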
3.6 Sequence Assembly and Analysis

1. Kraken is used to initially classify raw reads, and Krona
provides a graphical representation of the classified reads. A
custom Kraken database is used. It was built from the standard
database containing all RefSeq bacterial and viral genomes, along
with all complete swine enteric coronavirus disease (SECD)
genomes available at NCBI and a pig genome.

2. Raw reads are run through an in-house custom shell script. In
brief, 18 complete genomes from NCBI representing the 4
SECD virus species (TGEV, PRCV, PEDV, and PDCoV) are
used as references to align raw reads. A function is looped
three times. This function aligns reads and removes duplicates,
creates a VCF, updates the reference with the VCF information,
and performs a BLAST search against the nt database using the
updated reference. The 18 complete genomes are used to start
the first loop. The top hit returned from the first loop is used
as the reference for the next loop. A total of three loops are
performed to find the best reference.

3. After the best reference has been found, alignment metrics
including read counts, mean depth of coverage, and percent of
genome with coverage are collected.

4. Reports summarizing the alignment metrics (Fig. 1), along
with the interactive Krona HTML file of the Kraken
identifications, a FASTA file of the assembled genome, and a
depth of coverage profile graph (Fig. 2), are e-mailed to
concerned individuals.

5. The assembled FASTA file can be visually verified in IGV using
the BAM and VCF output from the script. If necessary, the
FASTA can be corrected in a program of choice.

6. Script details are provided on GitHub
(https://github.com/USDA-VS/public/blob/master/secd/idvirus.sh).
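
The alignment metrics named in step 3 can be illustrated with a small sketch. It assumes that per-position depths have already been extracted from the alignment into a list with one value per reference position; the function name is hypothetical, and this is not a reproduction of how the in-house script computes its report.

```python
def alignment_metrics(depths):
    """Mean depth and percent of positions covered from per-position depths.

    `depths` is assumed to hold one integer per reference position,
    with 0 for positions that no read covers.
    """
    genome_length = len(depths)
    if genome_length == 0:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for depth in depths if depth > 0)
    return {
        "mean_depth": sum(depths) / genome_length,
        "percent_covered": 100.0 * covered / genome_length,
    }


if __name__ == "__main__":
    # Toy example: four reference positions, one of them uncovered.
    print(alignment_metrics([0, 10, 20, 10]))
    # {'mean_depth': 10.0, 'percent_covered': 75.0}
```

A high percent of genome with coverage is what distinguishes a genuine match from the sporadic alignments described in the caption of Fig. 1.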


[Fig. 1 image: SECD report for sample OH851-RP-Virus listing the R1 and R2
fastq file sizes and read counts, the Kraken virus read count,
reference-guided alignment statistics (reference used, read count, percent
cov, ave depth) for porcine_epidemic_diarrhea_virus-KJ399978 and
porcine_Resp_Corona_virus-DQ811787, and the top BLAST hits: Porcine epidemic
diarrhea virus strain OH851, complete genome (KJ399978.1) and PRCV ISU-1,
complete genome (DQ811787.1).]


Fig. 1 Report summary for sample OH851-RP-Virus. The reference set used to initiate the shell script was
SECD (swine enteric coronavirus diseases). File size and read counts for each fastq file are shown. Kraken
identified 223,955 virus reads. “Reference used” is the closest match in the NCBI nt database.
The read count shows the number of raw reads matching the reference. “Percent cov” shows the
percent of the reference having coverage. A coverage of 98.36 % for PEDV indicates a true find relative to the
sporadic <51 % coverage seen for PRCV, although in this case, the presence of PRCV cannot be ruled out.
No reads matched TGEV or PDCoV, so these are not shown. Because of the high percent of
genome coverage, the completed reference-guided assembly for PEDV was searched with BLAST against the nt
database to provide mismatches, e-value, and bit score against the most closely related publicly available genome.
[Fig. 2 image: depth of coverage plot for OH851-RP-Virus. Legend (species):
porcine_epidemic_diarrhea_virus-KJ399978, porcine_Resp_Corona_virus-DQ811787.]


Fig. 2 The depth of coverage profiles for sample OH851-RP-Virus. The x-axis is the genome position. The
y-axis is the log depth of coverage. Reads matching any of the four SECD target viral species are shown.


4 Notes 


1. The MagMAX Pathogen RNA/DNA Kit was used for the 
extraction of nucleic acid for pathogen detection—including 
the detection of TGEV, PEDV, and PDCoV—from pig feces 
or intestinal contents. 

2. The real-time RT-PCR assay was developed by our laboratory 
and the primers and probes target the M gene of the virus. 

3. In our laboratory, the QIAGEN one-step RT-PCR kit has 
been used to amplify RT-PCR products between 100 bp and 
1.8 kb in length. 

4. Alternatively, 5 μl out of 25 μl can be loaded on the gel to
confirm that each target band is amplified, and the remaining
20 μl can then be purified with the QIAquick PCR Purification Kit.

5. A smaller amount of input DNA than the required 1 μg for the
targeted amplification method can be used to avoid an
overwhelming number of sequence reads.

6. The SISPA method is recommended when the Ct value of
real-time RT-PCR is below 15.

7. When 5 μl is applied to the gel, a smear band can be observed.